Final Project DSA 301: Buffalo Crime Incidents

Nguyet Que Tran

Abstract

Crime, neighborhood safety, and the quality of the living environment are important issues. By analyzing and visualizing Buffalo's crime data, this project gives a clearer view of the current situation in the city: the most common crime types, the days and times at which crime occurs most often, and the neighborhoods where criminal activity is concentrated. Beyond informing viewers, the project aims to raise their awareness and vigilance.

Questions

  1. Which types of crime happen most often?
  2. On which days does crime happen most often?
  3. At what time of day does crime happen most often?
  4. Does crime occur more on weekends or at night? Which types of crime occur more on weekends or at night?
  5. Which neighborhoods have a high or low frequency of crime?
  6. Are there any addresses where more than 2 crimes occurred?

By working with spatial attributes, this project focuses on building customized analytical modules for processing and analyzing geospatial data. The goal is to provide information about crime locations in Buffalo through geospatial mapping, such as 2D maps, interactive point frequency maps, and interactive point distribution maps.

  1. Mapping crime locations by different conditions, such as crime type and neighborhood.
  2. How do the locations and frequencies of theft cases differ from those of homicide cases in Buffalo?
  3. Mapping the number of confirmed crime cases by Buffalo Council District. Which Council District is the most dangerous?

About the Dataset

Source: Buffalo Open Data - Crime Incidents

This dataset contains information about crime incidents in Buffalo.

The dataset was created on September 6, 2017 and was last updated on May 22, 2022.

There are 281601 records in total and 29 attribute fields.

Contents

I. Exploratory Data Analysis

II. Data Cleaning

III. Visualization

IV. Spatial Data

I. Exploratory Data Analysis

In [166]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import math

As mentioned, there are 29 columns. Only the 7 columns needed for this project are read.

In [167]:
# Read the dataset from url, add ?$limit=300000 to read all records
crime_url = 'https://data.buffalony.gov/resource/d6g9-xbgu.csv?$limit=300000'
crime = pd.read_csv(crime_url, usecols=['case_number','incident_datetime','parent_incident_type','hour_of_day','day_of_week',
                                        'address_1','neighborhood_1'])
crime.tail()
Out[167]:
case_number incident_datetime parent_incident_type hour_of_day day_of_week address_1 neighborhood_1
281597 13-0790754 2013-03-20T20:47:00.000 Breaking & Entering 20 Wednesday 100 Block MOSELLE ST Genesee-Moselle
281598 15-1050782 2015-04-15T20:01:00.000 Assault 20 Wednesday S PARK AV & PRIES AV Hopkins-Tifft
281599 06-1561060 2006-06-05T22:55:00.000 Robbery 22 Monday 1700 Block BROADWAY Genesee-Moselle
281600 17-2690527 2017-09-26T11:00:00.000 Breaking & Entering 11 Tuesday 200 Block GROTE ST Grant-Amherst
281601 16-2251020 2016-08-12T20:10:00.000 Theft 20 Friday 200 Block BRYANT ST Elmwood Bryant
In [168]:
crime.shape
Out[168]:
(281602, 7)
  • The data used for this project contains 281602 records and 7 attributes.
In [169]:
crime.dtypes
Out[169]:
case_number             object
incident_datetime       object
parent_incident_type    object
hour_of_day              int64
day_of_week             object
address_1               object
neighborhood_1          object
dtype: object

Limitation of the dataset: it lacks numerical data.

The only numerical field that is useful and can be combined with other data is Hour of Day.

Ideal crime data would also contain information such as the number of people injured or killed.

In [170]:
crime['parent_incident_type'].value_counts()
Out[170]:
Theft                   123044
Assault                  57806
Breaking & Entering      53288
Theft of Vehicle         23086
Robbery                  18227
Sexual Assault            2497
Other Sexual Offense      2241
Homicide                   965
Sexual Offense             448
Name: parent_incident_type, dtype: int64
In [171]:
crime['day_of_week'].value_counts()
Out[171]:
Friday       42217
Saturday     41668
Monday       39989
Wednesday    39686
Tuesday      39422
Thursday     39341
Sunday       39274
Null             5
Name: day_of_week, dtype: int64
In [172]:
crime['day_of_week'].unique()
Out[172]:
array(['Thursday', 'Friday', 'Sunday', 'Monday', 'Saturday', 'Tuesday',
       'Wednesday', 'Null'], dtype=object)
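  • There is a naming problem in the Day of Week column: 5 records hold the placeholder string 'Null' instead of a day name. The output above shows no duplicates caused by lower and upper case, but the names are still converted to upper case during cleaning so that any such variants would collapse into one value.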
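
A check like the one described in this bullet can be sketched as follows. The series below is a toy stand-in for crime['day_of_week'] with made-up values, and the names expected_days, case_duplicates and placeholders are illustrative, not part of the original notebook.

```python
import pandas as pd

# Toy stand-in for crime['day_of_week']; the values are illustrative only.
days = pd.Series(['Thursday', 'FRIDAY', 'friday', 'Null', 'Sunday', 'Null'])

expected_days = {'MONDAY', 'TUESDAY', 'WEDNESDAY', 'THURSDAY',
                 'FRIDAY', 'SATURDAY', 'SUNDAY'}

# Upper-case once so that 'FRIDAY' and 'friday' compare equal.
normalized = days.str.upper()

# Day names that appear under more than one spelling in the raw data.
variants = days.groupby(normalized).nunique()
case_duplicates = variants[variants > 1].index.tolist()

# Values that are not day names at all, such as the 'Null' placeholder.
placeholders = sorted(set(normalized) - expected_days)

print(case_duplicates)  # ['FRIDAY']
print(placeholders)     # ['NULL']
```

On the real column, an empty case_duplicates list would confirm that only the placeholder needs handling.
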
In [173]:
crime['neighborhood_1'].value_counts()
Out[173]:
Broadway Fillmore     16152
Central               14996
Kensington-Bailey     14638
North Park            13643
Genesee-Moselle       12780
Schiller Park         11933
Elmwood Bidwell       11750
Elmwood Bryant        11306
Upper West Side       10714
University Heights    10609
West Side             10067
Kenfield              10025
Riverside              9257
Lovejoy                8816
Masten Park            8773
Lower West Side        7745
Hopkins-Tifft          7216
Delavan Grider         7181
Fillmore-Leroy         6673
Allentown              6535
Seneca-Cazenovia       6334
South Park             5943
MLK Park               5873
Parkside               5353
Fruit Belt             5275
West Hertel            5274
Black Rock             4568
Hamlin Park            4495
Pratt-Willert          4250
Grant-Amherst          4109
Ellicott               3729
Kaisertown             3482
Central Park           3329
Seneca Babcock         2996
UNKNOWN                2889
First Ward             1870
Name: neighborhood_1, dtype: int64
In [174]:
crime['hour_of_day'].describe()
Out[174]:
count    281602.000000
mean         11.805136
std           7.360526
min           0.000000
25%           6.000000
50%          12.000000
75%          18.000000
max          23.000000
Name: hour_of_day, dtype: float64
In [175]:
crime['hour_of_day'].unique()
Out[175]:
array([ 7, 23, 15, 22,  2, 17, 18, 20, 11, 12,  8, 14, 10,  1,  9, 21, 19,
       16,  6, 13,  4,  5,  0,  3])
In [176]:
# Year from datetime column
crime['incident_datetime'].str[0:4].value_counts()
Out[176]:
2007    21958
2009    21834
2010    21709
2012    20614
2011    20472
2006    19446
2013    18652
2014    17523
2015    17268
2016    16452
2018    15466
2017    15405
2019    13669
2008    13535
2020    12123
2021    11749
2022     3092
2005      317
2004       70
2000       57
2003       47
2002       33
2001       29
1996        8
1998        6
1970        5
1999        5
1960        4
1990        4
1963        3
1978        3
1993        3
1967        3
1981        3
1989        2
1988        2
1951        2
1995        2
1961        2
1986        2
1985        2
1992        2
1994        1
1980        1
1962        1
1910        1
1979        1
1984        1
1987        1
1991        1
1983        1
1914        1
1976        1
1972        1
1952        1
1997        1
Name: incident_datetime, dtype: int64

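The dataset was created in 2017, but it includes incidents dated much earlier; records before 2006 are sparse. The years with the highest and second-highest numbers of cases are 2007 and 2009. These years fall within the Great Recession, a crisis that spread across many financial sectors such as credit, insurance, and securities. Note that 2008, also a recession year, has far fewer records (13535), so the link is suggestive rather than conclusive.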
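
As an alternative to slicing the string, the year can be taken from a parsed datetime, which also makes it easy to set aside the sparse early years. This is a minimal sketch on toy timestamps; the 2006 cutoff is an assumption chosen from the listing above, not something the notebook applies.

```python
import pandas as pd

# Toy stand-in for crime['incident_datetime']; values are illustrative only.
stamps = pd.Series(['2007-05-01T10:00:00.000', '2007-08-15T23:10:00.000',
                    '2009-01-02T02:30:00.000', '1910-03-04T00:00:00.000', None])

# errors='coerce' turns missing or unparseable entries into NaT.
parsed = pd.to_datetime(stamps, errors='coerce')

# Incidents per year in chronological order; NaT rows are dropped first.
per_year = parsed.dropna().dt.year.value_counts().sort_index()

# Assumed cutoff: keep only 2006 onward, where the yearly counts are dense.
recent = per_year[per_year.index >= 2006]
print(recent)
```
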

In [177]:
# Month  from datetime column
crime['incident_datetime'].str[5:7].value_counts()
Out[177]:
07    28845
08    28737
09    25979
06    25708
10    25541
05    24038
11    23111
12    22203
01    21039
04    20697
03    19032
02    16667
Name: incident_datetime, dtype: int64

The number of cases falls with the temperature. July, August, and September are warm months, while February and March are colder months with high average snowfall. Buffalo's cold, snowy weather appears to affect the number of crime cases: crime happens less frequently in cold months than in warm months.
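
The size of this seasonal gap can be checked directly from the monthly counts printed above. The sketch below copies those counts into a Series; the variable names are illustrative.

```python
import pandas as pd

# Monthly counts copied from the value_counts() output above (1 = January).
monthly = pd.Series({
    1: 21039, 2: 16667, 3: 19032, 4: 20697, 5: 24038, 6: 25708,
    7: 28845, 8: 28737, 9: 25979, 10: 25541, 11: 23111, 12: 22203,
})

# Share of all dated incidents that fall in each month, in percent.
share = (monthly / monthly.sum() * 100).round(2)

# Ratio of the busiest month to the quietest month.
peak_to_trough = monthly.max() / monthly.min()

print(share)
print(monthly.idxmax(), monthly.idxmin(), round(peak_to_trough, 2))
```

July has roughly 1.7 times as many incidents as February. Part of February's gap comes simply from it being the shortest month.
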

Check missing values

In [178]:
# number of missing values in each columns
crime.isnull().sum()
Out[178]:
case_number                0
incident_datetime          5
parent_incident_type       0
hour_of_day                0
day_of_week                0
address_1                 39
neighborhood_1          1024
dtype: int64

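Out of 281,602 cases in total:

  • 5 cases are missing Incident Datetime information.

  • 39 cases are missing address information.

  • 1024 cases are missing neighborhood information.

  • Day of Week reports 0 missing values only because the 5 cases without a datetime store the string 'Null', which isnull() does not count.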
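
Placeholder strings such as 'Null' and 'UNKNOWN' are not counted by isnull(). A minimal sketch of one way to include them is shown below; the toy frame is made up, and treating both strings as missing is an assumption rather than a step the notebook takes.

```python
import numpy as np
import pandas as pd

# Toy frame; values are illustrative, not taken from the dataset.
toy = pd.DataFrame({
    'incident_datetime': ['2014-02-11T05:06:00.000', None, None],
    'day_of_week': ['Tuesday', 'Null', 'Null'],
    'neighborhood_1': ['Central', 'UNKNOWN', None],
})

# Assumption: these strings mean "no value" and should count as missing.
placeholders = ['Null', 'UNKNOWN']
cleaned = toy.replace(placeholders, np.nan)

print(toy.isnull().sum())      # placeholders are not counted here
print(cleaned.isnull().sum())  # placeholders are counted here
```
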

In [179]:
# Cases that do not have DateTime information
crime[crime['incident_datetime'].isnull()]
Out[179]:
case_number incident_datetime parent_incident_type hour_of_day day_of_week address_1 neighborhood_1
60920 14-0420506 NaN Theft 0 Null NaN UNKNOWN
63644 12-3390923 NaN Assault 0 Null GRANT ST & AMHERST ST UNKNOWN
64403 10-3130893 NaN Assault 0 Null 200 Block STEVENSON ST Seneca-Cazenovia
66461 11-0400654 NaN Breaking & Entering 0 Null BROADWAY & BAILEY AV UNKNOWN
84248 13-0720178 NaN Theft 0 Null 1 Block PLYMOUTH AV UNKNOWN
  • The exact time in the Incident Datetime column is reduced to an hour from 0 to 23 in the Hour of Day column. Cases without a datetime carry hour 0 and day 'Null' as placeholders.
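
Whether Hour of Day really matches the hour in Incident Datetime can be verified with a short comparison. The three rows below are copied from outputs in this notebook (cases 13-0790754, 22-1150192 and 14-0420506); the rest of the sketch, including the variable names, is illustrative.

```python
import pandas as pd

# Rows copied from outputs in this notebook.
toy = pd.DataFrame({
    'case_number': ['13-0790754', '22-1150192', '14-0420506'],
    'incident_datetime': ['2013-03-20T20:47:00.000', '2022-04-22T16:00:00.000', None],
    'hour_of_day': [20, 7, 0],
})

parsed = pd.to_datetime(toy['incident_datetime'], errors='coerce')

# Only rows with a real timestamp can be compared.
has_time = parsed.notna()
mismatch = has_time & (parsed.dt.hour != toy['hour_of_day'])

print(toy[mismatch])    # timestamp hour and hour_of_day disagree
print(toy[~has_time])   # hour_of_day is a placeholder, not a real hour
```

Case 22-1150192 is flagged because its timestamp reads 16:00 while Hour of Day is 7, so the two columns do not always agree.
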

Group crime types by neighborhood

In [180]:
pd.set_option('display.max_rows',500)
crime.groupby(['neighborhood_1','parent_incident_type']).size()
Out[180]:
neighborhood_1      parent_incident_type
Allentown           Assault                  905
                    Breaking & Entering      794
                    Homicide                  10
                    Other Sexual Offense      29
                    Robbery                  397
                    Sexual Assault            50
                    Sexual Offense             3
                    Theft                   3881
                    Theft of Vehicle         466
Black Rock          Assault                  932
                    Breaking & Entering      963
                    Homicide                  11
                    Other Sexual Offense      34
                    Robbery                  274
                    Sexual Assault            36
                    Sexual Offense             7
                    Theft                   1904
                    Theft of Vehicle         407
Broadway Fillmore   Assault                 3955
                    Breaking & Entering     3509
                    Homicide                 109
                    Other Sexual Offense     112
                    Robbery                 1387
                    Sexual Assault           160
                    Sexual Offense            16
                    Theft                   5417
                    Theft of Vehicle        1487
Central             Assault                 3193
                    Breaking & Entering     1208
                    Homicide                  22
                    Other Sexual Offense     127
                    Robbery                  763
                    Sexual Assault           183
                    Sexual Offense            22
                    Theft                   8680
                    Theft of Vehicle         798
Central Park        Assault                  461
                    Breaking & Entering      702
                    Homicide                   3
                    Other Sexual Offense      28
                    Robbery                  224
                    Sexual Assault            15
                    Sexual Offense             8
                    Theft                   1628
                    Theft of Vehicle         260
Delavan Grider      Assault                 2131
                    Breaking & Entering     1453
                    Homicide                  52
                    Other Sexual Offense      90
                    Robbery                  531
                    Sexual Assault            94
                    Sexual Offense            29
                    Theft                   2126
                    Theft of Vehicle         675
Ellicott            Assault                  815
                    Breaking & Entering      827
                    Homicide                   5
                    Other Sexual Offense      25
                    Robbery                  230
                    Sexual Assault            36
                    Sexual Offense             8
                    Theft                   1464
                    Theft of Vehicle         319
Elmwood Bidwell     Assault                 1230
                    Breaking & Entering     2138
                    Homicide                  16
                    Other Sexual Offense      82
                    Robbery                  641
                    Sexual Assault            68
                    Sexual Offense            11
                    Theft                   6545
                    Theft of Vehicle        1019
Elmwood Bryant      Assault                 1494
                    Breaking & Entering     1738
                    Homicide                  11
                    Other Sexual Offense      41
                    Robbery                  671
                    Sexual Assault            85
                    Sexual Offense            13
                    Theft                   6334
                    Theft of Vehicle         919
Fillmore-Leroy      Assault                 1787
                    Breaking & Entering     1208
                    Homicide                  41
                    Other Sexual Offense      65
                    Robbery                  516
                    Sexual Assault            85
                    Sexual Offense             8
                    Theft                   2327
                    Theft of Vehicle         636
First Ward          Assault                  450
                    Breaking & Entering      451
                    Homicide                   5
                    Other Sexual Offense      22
                    Robbery                   89
                    Sexual Assault            13
                    Sexual Offense             1
                    Theft                    682
                    Theft of Vehicle         157
Fruit Belt          Assault                 1162
                    Breaking & Entering      738
                    Homicide                  22
                    Other Sexual Offense      46
                    Robbery                  359
                    Sexual Assault            65
                    Sexual Offense            17
                    Theft                   2448
                    Theft of Vehicle         418
Genesee-Moselle     Assault                 3433
                    Breaking & Entering     2968
                    Homicide                 100
                    Other Sexual Offense      99
                    Robbery                 1017
                    Sexual Assault           144
                    Sexual Offense            29
                    Theft                   3840
                    Theft of Vehicle        1150
Grant-Amherst       Assault                  774
                    Breaking & Entering      931
                    Homicide                   7
                    Other Sexual Offense      30
                    Robbery                  255
                    Sexual Assault            25
                    Sexual Offense             5
                    Theft                   1730
                    Theft of Vehicle         352
Hamlin Park         Assault                 1049
                    Breaking & Entering     1044
                    Homicide                  19
                    Other Sexual Offense      42
                    Robbery                  291
                    Sexual Assault            39
                    Sexual Offense             9
                    Theft                   1529
                    Theft of Vehicle         473
Hopkins-Tifft       Assault                 1595
                    Breaking & Entering     1067
                    Homicide                   9
                    Other Sexual Offense      71
                    Robbery                  305
                    Sexual Assault            80
                    Sexual Offense            16
                    Theft                   3533
                    Theft of Vehicle         540
Kaisertown          Assault                  849
                    Breaking & Entering      722
                    Homicide                   8
                    Other Sexual Offense      40
                    Robbery                  123
                    Sexual Assault            29
                    Sexual Offense            10
                    Theft                   1400
                    Theft of Vehicle         301
Kenfield            Assault                 2440
                    Breaking & Entering     2169
                    Homicide                  44
                    Other Sexual Offense     103
                    Robbery                  714
                    Sexual Assault            98
                    Sexual Offense            13
                    Theft                   3453
                    Theft of Vehicle         991
Kensington-Bailey   Assault                 2870
                    Breaking & Entering     3056
                    Homicide                  60
                    Other Sexual Offense     108
                    Robbery                 1061
                    Sexual Assault            85
                    Sexual Offense            16
                    Theft                   6110
                    Theft of Vehicle        1272
Lovejoy             Assault                 2112
                    Breaking & Entering     1836
                    Homicide                  25
                    Other Sexual Offense      63
                    Robbery                  520
                    Sexual Assault            92
                    Sexual Offense            17
                    Theft                   3432
                    Theft of Vehicle         719
Lower West Side     Assault                 1634
                    Breaking & Entering     1230
                    Homicide                  27
                    Other Sexual Offense      61
                    Robbery                  465
                    Sexual Assault            72
                    Sexual Offense             9
                    Theft                   3741
                    Theft of Vehicle         506
MLK Park            Assault                 1694
                    Breaking & Entering     1015
                    Homicide                  40
                    Other Sexual Offense      56
                    Robbery                  450
                    Sexual Assault            73
                    Sexual Offense            14
                    Theft                   1907
                    Theft of Vehicle         624
Masten Park         Assault                 2068
                    Breaking & Entering     1697
                    Homicide                  45
                    Other Sexual Offense      63
                    Robbery                  692
                    Sexual Assault            74
                    Sexual Offense             6
                    Theft                   3306
                    Theft of Vehicle         822
North Park          Assault                 1272
                    Breaking & Entering     1604
                    Homicide                  13
                    Other Sexual Offense      52
                    Robbery                  526
                    Sexual Assault            65
                    Sexual Offense            12
                    Theft                   9292
                    Theft of Vehicle         807
Parkside            Assault                  447
                    Breaking & Entering      967
                    Homicide                   3
                    Other Sexual Offense      21
                    Robbery                  258
                    Sexual Assault            33
                    Sexual Offense             5
                    Theft                   3274
                    Theft of Vehicle         345
Pratt-Willert       Assault                 1072
                    Breaking & Entering      821
                    Homicide                  21
                    Other Sexual Offense      39
                    Robbery                  311
                    Sexual Assault            52
                    Sexual Offense             5
                    Theft                   1557
                    Theft of Vehicle         372
Riverside           Assault                 1967
                    Breaking & Entering     1950
                    Homicide                  31
                    Other Sexual Offense      92
                    Robbery                  595
                    Sexual Assault            87
                    Sexual Offense            21
                    Theft                   3784
                    Theft of Vehicle         730
Schiller Park       Assault                 2973
                    Breaking & Entering     2991
                    Homicide                  62
                    Other Sexual Offense     103
                    Robbery                 1007
                    Sexual Assault            91
                    Sexual Offense            16
                    Theft                   3524
                    Theft of Vehicle        1166
Seneca Babcock      Assault                  699
                    Breaking & Entering      664
                    Homicide                   3
                    Other Sexual Offense      24
                    Robbery                  147
                    Sexual Assault            25
                    Sexual Offense             7
                    Theft                   1145
                    Theft of Vehicle         282
Seneca-Cazenovia    Assault                 1545
                    Breaking & Entering     1174
                    Homicide                  12
                    Other Sexual Offense      78
                    Robbery                  246
                    Sexual Assault            45
                    Sexual Offense            14
                    Theft                   2726
                    Theft of Vehicle         494
South Park          Assault                 1198
                    Breaking & Entering     1064
                    Homicide                   8
                    Other Sexual Offense      57
                    Robbery                  136
                    Sexual Assault            36
                    Sexual Offense             7
                    Theft                   2963
                    Theft of Vehicle         474
UNKNOWN             Assault                  577
                    Breaking & Entering      480
                    Homicide                   3
                    Other Sexual Offense      19
                    Robbery                  182
                    Sexual Assault            25
                    Sexual Offense             1
                    Theft                   1438
                    Theft of Vehicle         164
University Heights  Assault                 1807
                    Breaking & Entering     2550
                    Homicide                  32
                    Other Sexual Offense      78
                    Robbery                  919
                    Sexual Assault            78
                    Sexual Offense            11
                    Theft                   4278
                    Theft of Vehicle         856
Upper West Side     Assault                 1961
                    Breaking & Entering     2392
                    Homicide                  36
                    Other Sexual Offense      85
                    Robbery                  940
                    Sexual Assault           104
                    Sexual Offense            13
                    Theft                   4348
                    Theft of Vehicle         835
West Hertel         Assault                  933
                    Breaking & Entering      840
                    Homicide                  10
                    Other Sexual Offense      71
                    Robbery                  246
                    Sexual Assault            51
                    Sexual Offense            12
                    Theft                   2691
                    Theft of Vehicle         420
West Side           Assault                 2177
                    Breaking & Entering     2268
                    Homicide                  37
                    Other Sexual Offense      85
                    Robbery                  684
                    Sexual Assault           104
                    Sexual Offense            14
                    Theft                   3952
                    Theft of Vehicle         746
dtype: int64

II. Data Cleaning

Drop missing values

In [181]:
crime.dropna(how='any',inplace=True)
crime.shape
Out[181]:
(280535, 7)

Fix naming error of column Day of Week

In [182]:
crime['day_of_week']=crime['day_of_week'].str.upper()
In [183]:
crime['day_of_week'].value_counts()
Out[183]:
FRIDAY       42087
SATURDAY     41549
MONDAY       39808
WEDNESDAY    39518
TUESDAY      39259
THURSDAY     39175
SUNDAY       39139
Name: day_of_week, dtype: int64

III. Visualization

Type of crime incidents

In [184]:
len(crime['parent_incident_type'])
Out[184]:
280535
In [185]:
plt.figure(figsize=(12,5))
chart = sns.countplot(y='parent_incident_type', data=crime,palette='Spectral')
# set name for the plot
chart.set_title(f'{chart.get_ylabel().capitalize()}',fontweight='bold')
# add percentages for each bar
for p in chart.patches:
    chart.text(p.get_width(),p.get_y()+0.5,'{:1.2f}%'.format(p.get_width()*100/ float(len(crime['parent_incident_type']))),ha='left')
plt.show()
  • 43.62% - Almost half of the recorded crime incidents in Buffalo are theft cases.

  • The top crime types are theft, assault, and breaking and entering.
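
The percentages drawn on the bars can also be produced as a table with value_counts(normalize=True). This is a small sketch on made-up labels, not the notebook's data.

```python
import pandas as pd

# Toy stand-in for crime['parent_incident_type']; values are illustrative only.
types = pd.Series(['Theft', 'Theft', 'Assault', 'Theft', 'Robbery',
                   'Breaking & Entering', 'Assault', 'Theft'])

# normalize=True returns fractions of the total instead of raw counts.
share = types.value_counts(normalize=True).mul(100).round(2)
print(share)
```
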

Day of Week and Hour of Day

In [186]:
# create function to draw multiple countplots 
def plot_multiple_countplots(crime, cols,num_cols,num_rows, hue=None):
             
    fig, axs = plt.subplots(num_rows, num_cols,figsize=(20, 10))
  
    for index, col in enumerate(cols):
        i = math.floor(index/num_cols)
        j = index - i*num_cols      
        
        if num_rows == 1:
            if num_cols == 1:
                chart = sns.countplot(x=crime[col], ax=axs, hue = hue, palette='Spectral')              
            else:
                chart = sns.countplot(x=crime[col], ax=axs[j],hue = hue, palette='Spectral')                
        else:
            chart = sns.countplot(x=crime[col], ax=axs[i, j],hue = hue, palette='Spectral')         
        # rotate axis labels   
        chart.set_xticklabels(chart.get_xticklabels(), rotation=15, ha ='center')           
        # set names each countplot
        chart.set_title(f'{chart.get_xlabel().capitalize()}',fontweight='bold')              
        # add percentages on top of each bar
        for p in chart.patches:               
            chart.text(p.get_x(),p.get_height()+1,'{:1.2f}%'.format(p.get_height()*100/ float(len(crime[col]))),ha='left')
In [187]:
plot_multiple_countplots(crime, ['day_of_week','hour_of_day'],2,1)

1. Day of week:

  • Friday and Saturday are slightly more dangerous than other days.

2. Hour of Day:

  • 12 a.m. is the hour at which crime incidents are most likely.
In [188]:
plt.figure(figsize=(17,10))
chart = sns.boxplot(x='parent_incident_type', y='hour_of_day',data=crime, hue ='day_of_week' , palette='Spectral')
chart.set_title(f'Timeline of Incident Types',fontweight='bold')
plt.show()
  • Most property-related cases, such as Theft, Theft of Vehicle, and Breaking & Entering, happen between about 8 a.m. and 4 p.m., the time frame of working hours.

  • Cases involving interpersonal conflict, such as Assault, Robbery, Sexual Assault, and Homicide, are spread over a wider and more variable time frame.

Does most crime happen on weekends?

In [189]:
crime['Weekend'] = crime['day_of_week'].isin(['SATURDAY', 'SUNDAY'])
ax=sns.catplot(x='parent_incident_type', y='hour_of_day', hue='Weekend', kind='box', dodge=False, data=crime)
ax.fig.suptitle(f'Crime on Weekend',fontweight='bold')
ax.fig.set_size_inches(17,5)
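  • Per day, yes, but only slightly. Saturday and Sunday together hold 80,688 of the 280,535 cleaned records (28.8%), compared with the 28.6% expected if incidents were spread evenly over the week. Most incidents still occur on weekdays. Note that the box plot above shows the distribution of hours, not the number of incidents.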
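
The weekend question can be checked with simple arithmetic on the cleaned day-of-week counts printed in section II. The sketch below copies those counts; the variable names are illustrative.

```python
import pandas as pd

# Day-of-week counts copied from the cleaned value_counts() output in section II.
by_day = pd.Series({
    'FRIDAY': 42087, 'SATURDAY': 41549, 'MONDAY': 39808, 'WEDNESDAY': 39518,
    'TUESDAY': 39259, 'THURSDAY': 39175, 'SUNDAY': 39139,
})

weekend = by_day[['SATURDAY', 'SUNDAY']].sum()
total = by_day.sum()

weekend_share = weekend / total   # observed share of incidents on weekends
even_share = 2 / 7                # share if every day had equal counts

# Average incidents per weekend day and per weekday.
weekend_per_day = weekend / 2
weekday_per_day = (total - weekend) / 5

print(round(weekend_share * 100, 2), round(even_share * 100, 2))
print(weekend_per_day, weekday_per_day)
```

The per-day average is higher on weekends, but mostly because of Saturday; Sunday has the lowest count of any day.
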

Does most crime happen at night?

In [190]:
x = [0,1,2,3,4,5,6,20,21,22,23]
crime['Night Time'] = crime['hour_of_day'].isin(x)

plt.figure(figsize=(12,7))
chart = sns.countplot(y='parent_incident_type', data=crime,hue='Night Time')
# set name for the plot
chart.set_title(f'Crime at Night',fontweight='bold')
# add percentages for each bar
for p in chart.patches:
    chart.text(p.get_width(),p.get_y()+0.25,'{:1.2f}%'.format(p.get_width()*100/ float(len(crime['parent_incident_type']))),ha='left')
plt.show()
  • Night time in this project covers hours 20-23 and 0-6, that is, from 8 p.m. until just before 7 a.m.

  • Only Theft, Breaking & Entering, and Sexual Offense occur more in the daytime. Daytime, especially from 8 a.m. to 4 p.m., covers office working hours: people who have left for work or are staying at the office create good conditions for thieves and intruders.

  • All other types of crime incident occur more at night.
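
The day and night split per incident type can also be read from a normalized cross-tabulation rather than from bar lengths. This is a minimal sketch on made-up rows; the period column and its 'Day' and 'Night' labels are illustrative names, while the list of night hours is the one used in the cell above.

```python
import numpy as np
import pandas as pd

# Toy rows; values are illustrative, not taken from the dataset.
toy = pd.DataFrame({
    'parent_incident_type': ['Theft', 'Theft', 'Theft', 'Assault', 'Assault', 'Robbery'],
    'hour_of_day': [9, 14, 22, 23, 2, 21],
})

# Same night definition as the notebook: hours 20-23 and 0-6.
night_hours = [20, 21, 22, 23, 0, 1, 2, 3, 4, 5, 6]
toy['period'] = np.where(toy['hour_of_day'].isin(night_hours), 'Night', 'Day')

# normalize='index' makes each row (incident type) sum to 100%.
night_share = pd.crosstab(toy['parent_incident_type'], toy['period'],
                          normalize='index').mul(100).round(1)
print(night_share)
```

Since the night window covers 11 of the 24 hours, a type spread evenly over the day would show about 45.8% of its cases at night; that is the baseline to compare against, not 50%.
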

Neighborhood and Location

In [191]:
plt.figure(figsize=(17,10))
chart = sns.countplot(y='neighborhood_1', data=crime,palette='Spectral')
# set name for the plot
chart.set_title(f'Neighborhood and Crime Cases',fontweight='bold')
for p in chart.patches:
    chart.text(p.get_width(),p.get_y()+0.5,'{:1.2f}%'.format(p.get_width()*100/ float(len(crime['neighborhood_1']))),ha='left')
plt.show()

High frequency of crime - Dangerous Neighborhoods:

  1. Broadway Fillmore

  2. Central

  3. Kensington-Bailey

  4. North Park

  5. Genesee-Moselle

Low frequency of crime - Safe Neighborhoods:

  1. First Ward

  2. Seneca Babcock

  3. Central Park

  4. Kaisertown

  5. Ellicott
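
The two rankings above can be reproduced with nlargest and nsmallest once the UNKNOWN placeholder is set aside. The sketch below uses only a few of the neighborhood totals printed in section I, so it shows the top and bottom three rather than five.

```python
import pandas as pd

# A few neighborhood totals copied from the value_counts() output in section I.
counts = pd.Series({
    'Broadway Fillmore': 16152, 'Central': 14996, 'Kensington-Bailey': 14638,
    'Kaisertown': 3482, 'Central Park': 3329, 'Seneca Babcock': 2996,
    'UNKNOWN': 2889, 'First Ward': 1870,
})

# UNKNOWN is a placeholder, not a neighborhood, so leave it out of the ranking.
known = counts.drop('UNKNOWN')

print(known.nlargest(3))
print(known.nsmallest(3))
```

These are raw counts. Comparing crime rates would also require population figures per neighborhood, which this dataset does not contain.
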

In [192]:
plt.figure(figsize=(17,10))
chart = sns.countplot(y='neighborhood_1', data=crime, hue= 'parent_incident_type',palette='Spectral')
# set name for the plot
chart.set_title(f'Neighborhood with Incident Type',fontweight='bold')
plt.show()
  • North Park has the highest number of theft cases of any neighborhood, and most of its cases are theft. In total cases, however, Broadway Fillmore ranks first.

  • Neighborhoods that suffer most from Theft: North Park, Broadway Fillmore, Central, Kensington-Bailey, Elmwood Bidwell, and Elmwood Bryant.

WHAT-IF No-Theft?

Because 43.62% of recorded cases are theft cases, this step removes them to allow a closer look at the other incident types in each neighborhood. The filter matches any type containing 'Theft', so Theft of Vehicle cases are removed as well.

In [193]:
# Remove all Theft cases; str.contains('Theft') also matches 'Theft of Vehicle'
crime2 = crime[~crime['parent_incident_type'].str.contains('Theft')]
# Draw chart
plt.figure(figsize=(17,10))
chart = sns.countplot(y='neighborhood_1', data=crime2, hue= 'parent_incident_type',palette='tab10')
# set name for the plot
chart.set_title(f'Neighborhood with Non-Theft Incident Type',fontweight='bold')
plt.show()
  • With theft cases removed, the neighborhoods suffer mainly from assault and breaking and entering.

  • High frequency of Assault: Broadway Fillmore, Genesee-Moselle, Schiller Park, Central, Kenfield, and Delavan Grider.

  • High frequency of Breaking & Entering: Broadway Fillmore, Genesee-Moselle, Schiller Park, Kensington-Bailey, and University Heights.

  • High frequency of Robbery: Broadway Fillmore, Genesee-Moselle, and University Heights.

  • With theft cases removed, North Park is no longer among the most dangerous neighborhoods.

In [194]:
plt.figure(figsize=(17,10))
chart = sns.countplot(y='neighborhood_1', hue = 'day_of_week',data=crime,palette='Spectral')
# set name for the plot
chart.set_title(f'Day of Crime in Neighborhood',fontweight='bold')
plt.show()
  • The Central neighborhood is more dangerous at the weekend.

  • High crime on all days of the week: Broadway Fillmore, North Park, Kensington-Bailey, Schiller Park, Genesee-Moselle and Elmwood Bidwell.

Check duplicated addresses

In [195]:
crime.duplicated(subset=['address_1'],keep='first').sum()
Out[195]:
259787
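  • Of the 281601 records in total, 259787 have an address that already appeared in an earlier record.

=> 92.25% of records repeat an earlier address, so the incidents are concentrated at roughly 21814 distinct addresses (281601 - 259787). Note that duplicated(keep='first') counts repeated records, not the number of addresses with 2 or more cases.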
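
The question about addresses with repeated incidents can also be approached by counting records per address directly. The following is a minimal sketch on a small made-up table: the `toy` DataFrame and its values are hypothetical, and only the `address_1` column name is taken from the dataset. It shows that the share of addresses with at least two incidents and the share of records that repeat an earlier address are two different numbers.

```python
import pandas as pd

# Toy data for illustration only; the values are hypothetical.
toy = pd.DataFrame({
    "address_1": ["A ST", "A ST", "A ST", "B AV", "C RD", "C RD"],
})

# Number of records per distinct address.
per_address = toy["address_1"].value_counts()

# Addresses with at least two recorded incidents.
repeat_addresses = per_address[per_address >= 2]

# Share of distinct addresses that have at least two incidents.
share_of_addresses = len(repeat_addresses) / len(per_address)

# Share of records that repeat an earlier address
# (this is what duplicated(keep='first') measures).
share_of_records = toy.duplicated(subset=["address_1"], keep="first").sum() / len(toy)

print(per_address.to_dict())
print(f"{share_of_addresses:.2%} of addresses, {share_of_records:.2%} of records")
```

On the real data, the same two ratios could be computed by replacing `toy` with `crime`.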

In [196]:
crime.loc[crime.duplicated(subset=['address_1'], keep='first'),:]
Out[196]:
case_number incident_datetime parent_incident_type hour_of_day day_of_week address_1 neighborhood_1 Weekend Night Time
42 22-1120947 2022-04-22T22:01:07.000 Theft 22 FRIDAY 3000 Block BAILEY AV Kensington-Bailey False True
43 22-1150192 2022-04-22T16:00:00.000 Theft 7 MONDAY 100 Block SUMMER ST Elmwood Bryant False False
48 22-1120347 2022-04-22T11:30:00.000 Theft 11 FRIDAY 2600 Block DELAWARE AV North Park False False
56 22-1120410 2022-04-22T12:44:00.000 Theft 12 FRIDAY 2100 Block ELMWOOD AV North Park False False
59 22-1170421 2022-04-27T14:15:32.000 Theft 14 WEDNESDAY 1500 Block HERTEL AV North Park False False
... ... ... ... ... ... ... ... ... ...
281597 13-0790754 2013-03-20T20:47:00.000 Breaking & Entering 20 WEDNESDAY 100 Block MOSELLE ST Genesee-Moselle False True
281598 15-1050782 2015-04-15T20:01:00.000 Assault 20 WEDNESDAY S PARK AV & PRIES AV Hopkins-Tifft False True
281599 06-1561060 2006-06-05T22:55:00.000 Robbery 22 MONDAY 1700 Block BROADWAY Genesee-Moselle False True
281600 17-2690527 2017-09-26T11:00:00.000 Breaking & Entering 11 TUESDAY 200 Block GROTE ST Grant-Amherst False False
281601 16-2251020 2016-08-12T20:10:00.000 Theft 20 FRIDAY 200 Block BRYANT ST Elmwood Bryant False True

259787 rows × 9 columns

IV. Spatial Data

In [197]:
%%time 

!apt install gdal-bin python-gdal python3-gdal 
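# Install rtree - Geopandas requirement
!apt install python3-rtree 
# Install Geopandas from source. Note: this step fails (see the log below) because GitHub no longer supports the git:// protocol; the next cell installs Geopandas from PyPI instead.
!pip install git+git://github.com/geopandas/geopandas.git
# Install descartes - Geopandas requirement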
!pip install descartes
Reading package lists... Done
Building dependency tree       
Reading state information... Done
gdal-bin is already the newest version (2.2.3+dfsg-2).
python-gdal is already the newest version (2.2.3+dfsg-2).
python3-gdal is already the newest version (2.2.3+dfsg-2).
The following packages were automatically installed and are no longer required:
  libnvidia-common-460 nsight-compute-2020.2.0
Use 'apt autoremove' to remove them.
0 upgraded, 0 newly installed, 0 to remove and 42 not upgraded.
Reading package lists... Done
Building dependency tree       
Reading state information... Done
python3-rtree is already the newest version (0.8.3+ds-1).
The following packages were automatically installed and are no longer required:
  libnvidia-common-460 nsight-compute-2020.2.0
Use 'apt autoremove' to remove them.
0 upgraded, 0 newly installed, 0 to remove and 42 not upgraded.
Collecting git+git://github.com/geopandas/geopandas.git
  Cloning git://github.com/geopandas/geopandas.git to /tmp/pip-req-build-huj_le2_
  Running command git clone -q git://github.com/geopandas/geopandas.git /tmp/pip-req-build-huj_le2_
  fatal: remote error:
    The unauthenticated git protocol on port 9418 is no longer supported.
  Please see https://github.blog/2021-09-01-improving-git-protocol-security-github/ for more information.
WARNING: Discarding git+git://github.com/geopandas/geopandas.git. Command errored out with exit status 128: git clone -q git://github.com/geopandas/geopandas.git /tmp/pip-req-build-huj_le2_ Check the logs for full command output.
ERROR: Command errored out with exit status 128: git clone -q git://github.com/geopandas/geopandas.git /tmp/pip-req-build-huj_le2_ Check the logs for full command output.
Requirement already satisfied: descartes in /usr/local/lib/python3.7/dist-packages (1.1.0)
Requirement already satisfied: matplotlib in /usr/local/lib/python3.7/dist-packages (from descartes) (3.2.2)
Requirement already satisfied: numpy>=1.11 in /usr/local/lib/python3.7/dist-packages (from matplotlib->descartes) (1.21.6)
Requirement already satisfied: python-dateutil>=2.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib->descartes) (2.8.2)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib->descartes) (3.0.8)
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.7/dist-packages (from matplotlib->descartes) (0.11.0)
Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib->descartes) (1.4.2)
Requirement already satisfied: typing-extensions in /usr/local/lib/python3.7/dist-packages (from kiwisolver>=1.0.1->matplotlib->descartes) (4.2.0)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.7/dist-packages (from python-dateutil>=2.1->matplotlib->descartes) (1.15.0)
CPU times: user 140 ms, sys: 203 ms, total: 343 ms
Wall time: 9.21 s
In [198]:
!pip install geopandas
Requirement already satisfied: geopandas in /usr/local/lib/python3.7/dist-packages (0.10.2)
Requirement already satisfied: fiona>=1.8 in /usr/local/lib/python3.7/dist-packages (from geopandas) (1.8.21)
Requirement already satisfied: pandas>=0.25.0 in /usr/local/lib/python3.7/dist-packages (from geopandas) (1.3.5)
Requirement already satisfied: shapely>=1.6 in /usr/local/lib/python3.7/dist-packages (from geopandas) (1.8.1.post1)
Requirement already satisfied: pyproj>=2.2.0 in /usr/local/lib/python3.7/dist-packages (from geopandas) (3.2.1)
Requirement already satisfied: six>=1.7 in /usr/local/lib/python3.7/dist-packages (from fiona>=1.8->geopandas) (1.15.0)
Requirement already satisfied: click>=4.0 in /usr/local/lib/python3.7/dist-packages (from fiona>=1.8->geopandas) (7.1.2)
Requirement already satisfied: certifi in /usr/local/lib/python3.7/dist-packages (from fiona>=1.8->geopandas) (2021.10.8)
Requirement already satisfied: attrs>=17 in /usr/local/lib/python3.7/dist-packages (from fiona>=1.8->geopandas) (21.4.0)
Requirement already satisfied: cligj>=0.5 in /usr/local/lib/python3.7/dist-packages (from fiona>=1.8->geopandas) (0.7.2)
Requirement already satisfied: click-plugins>=1.0 in /usr/local/lib/python3.7/dist-packages (from fiona>=1.8->geopandas) (1.1.1)
Requirement already satisfied: munch in /usr/local/lib/python3.7/dist-packages (from fiona>=1.8->geopandas) (2.5.0)
Requirement already satisfied: setuptools in /usr/local/lib/python3.7/dist-packages (from fiona>=1.8->geopandas) (57.4.0)
Requirement already satisfied: python-dateutil>=2.7.3 in /usr/local/lib/python3.7/dist-packages (from pandas>=0.25.0->geopandas) (2.8.2)
Requirement already satisfied: pytz>=2017.3 in /usr/local/lib/python3.7/dist-packages (from pandas>=0.25.0->geopandas) (2022.1)
Requirement already satisfied: numpy>=1.17.3 in /usr/local/lib/python3.7/dist-packages (from pandas>=0.25.0->geopandas) (1.21.6)
In [199]:
import geopandas as gpd
In [200]:
pd.set_option('display.max_columns',None) 
# Add $limit=300000 to read in all records; the default is 1000 records.
crime_url = "https://data.buffalony.gov/resource/d6g9-xbgu.geojson?$limit=300000"
crime_gdf = gpd.read_file(crime_url)
#crime_gdf = gpd.read_file(crime_url, ignore_fields=["iso_a3", "gdp_md_est"])
crime_gdf.tail()
Out[200]:
city neighborhood_1 police_district latitude parent_incident_type state geoid20_block day_of_week incident_id tractce20 incident_description geoid20_tract census_block longitude census_block_2010 census_block_group census_block_group_2010 census_tract hour_of_day created_at address_1 geoid20_blockgroup incident_type_primary updated_at case_number census_tract_2010 incident_datetime council_district geometry
281597 Buffalo Genesee-Moselle District C 42.907 Breaking & Entering NY 360290034004001 Wednesday None 002900 Buffalo Police are investigating this report o... 36029002900 4001 -78.825 4001 4 4 29 20 2013-03-22T06:00:00 100 Block MOSELLE ST 360290002004 BURGLARY None 13-0790754 29 2013-03-20T20:47:00 FILLMORE POINT (-78.82500 42.90700)
281598 Buffalo Hopkins-Tifft District A 42.843 Assault NY 360290035022002 Wednesday None 000110 Buffalo Police are investigating this report o... 36029000110 2002 -78.826 4002 2 4 1.10 20 2015-04-16T06:06:00 S PARK AV & PRIES AV 360290001102 ASSAULT None 15-1050782 1.10 2015-04-15T20:01:00 SOUTH POINT (-78.82600 42.84300)
281599 Buffalo Genesee-Moselle District C 42.899 Robbery NY 360290045002009 Monday None 003000 Buffalo Police are investigating this report o... 36029003000 2009 -78.81 2005 2 2 30 22 2019-09-24T22:50:00 1700 Block BROADWAY 360290001102 ROBBERY None 06-1561060 30 2006-06-05T22:55:00 LOVEJOY POINT (-78.81000 42.89900)
281600 Buffalo Grant-Amherst District D 42.943 Breaking & Entering NY 360290035021000 Tuesday None 005500 Buffalo Police are investigating this report o... 36029005500 1000 -78.882 1000 1 1 55 11 2017-09-27T06:05:00 200 Block GROTE ST 360290001101 BURGLARY None 17-2690527 55 2017-09-26T11:00:00 NORTH POINT (-78.88200 42.94300)
281601 Buffalo Elmwood Bryant District B 42.909 Theft NY 360290035022002 Friday None 006702 Buffalo Police are investigating this report o... 36029006702 2002 -78.875 3001 2 3 67.02 20 2019-09-23T23:24:00 200 Block BRYANT ST 360290001102 LARCENY/THEFT None 16-2251020 67.02 2016-08-12T20:10:00 NIAGARA POINT (-78.87500 42.90900)
In [201]:
!pip install contextily
Requirement already satisfied: contextily in /usr/local/lib/python3.7/dist-packages (1.2.0)
Requirement already satisfied: matplotlib in /usr/local/lib/python3.7/dist-packages (from contextily) (3.2.2)
Requirement already satisfied: geopy in /usr/local/lib/python3.7/dist-packages (from contextily) (1.17.0)
Requirement already satisfied: mercantile in /usr/local/lib/python3.7/dist-packages (from contextily) (1.2.1)
Requirement already satisfied: rasterio in /usr/local/lib/python3.7/dist-packages (from contextily) (1.2.10)
Requirement already satisfied: requests in /usr/local/lib/python3.7/dist-packages (from contextily) (2.23.0)
Requirement already satisfied: pillow in /usr/local/lib/python3.7/dist-packages (from contextily) (7.1.2)
Requirement already satisfied: xyzservices in /usr/local/lib/python3.7/dist-packages (from contextily) (2022.4.0)
Requirement already satisfied: joblib in /usr/local/lib/python3.7/dist-packages (from contextily) (1.1.0)
Requirement already satisfied: geographiclib<2,>=1.49 in /usr/local/lib/python3.7/dist-packages (from geopy->contextily) (1.52)
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.7/dist-packages (from matplotlib->contextily) (0.11.0)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib->contextily) (3.0.8)
Requirement already satisfied: numpy>=1.11 in /usr/local/lib/python3.7/dist-packages (from matplotlib->contextily) (1.21.6)
Requirement already satisfied: python-dateutil>=2.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib->contextily) (2.8.2)
Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib->contextily) (1.4.2)
Requirement already satisfied: typing-extensions in /usr/local/lib/python3.7/dist-packages (from kiwisolver>=1.0.1->matplotlib->contextily) (4.2.0)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.7/dist-packages (from python-dateutil>=2.1->matplotlib->contextily) (1.15.0)
Requirement already satisfied: click>=3.0 in /usr/local/lib/python3.7/dist-packages (from mercantile->contextily) (7.1.2)
Requirement already satisfied: snuggs>=1.4.1 in /usr/local/lib/python3.7/dist-packages (from rasterio->contextily) (1.4.7)
Requirement already satisfied: certifi in /usr/local/lib/python3.7/dist-packages (from rasterio->contextily) (2021.10.8)
Requirement already satisfied: affine in /usr/local/lib/python3.7/dist-packages (from rasterio->contextily) (2.3.1)
Requirement already satisfied: click-plugins in /usr/local/lib/python3.7/dist-packages (from rasterio->contextily) (1.1.1)
Requirement already satisfied: setuptools in /usr/local/lib/python3.7/dist-packages (from rasterio->contextily) (57.4.0)
Requirement already satisfied: attrs in /usr/local/lib/python3.7/dist-packages (from rasterio->contextily) (21.4.0)
Requirement already satisfied: cligj>=0.5 in /usr/local/lib/python3.7/dist-packages (from rasterio->contextily) (0.7.2)
Requirement already satisfied: chardet<4,>=3.0.2 in /usr/local/lib/python3.7/dist-packages (from requests->contextily) (3.0.4)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /usr/local/lib/python3.7/dist-packages (from requests->contextily) (1.24.3)
Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.7/dist-packages (from requests->contextily) (2.10)
In [202]:
import contextily as ctx
%matplotlib inline
In [203]:
crime_gdf.drop(['incident_id','updated_at'], axis=1,inplace=True)
crime_gdf.head()
Out[203]:
city neighborhood_1 police_district latitude parent_incident_type state geoid20_block day_of_week tractce20 incident_description geoid20_tract census_block longitude census_block_2010 census_block_group census_block_group_2010 census_tract hour_of_day created_at address_1 geoid20_blockgroup incident_type_primary case_number census_tract_2010 incident_datetime council_district geometry
0 Buffalo Schiller Park District E 42.918 Theft NY 360290037001005 Thursday 003700 Buffalo Police are investigating this report o... 36029003700 1005 -78.804 1005 1 1 37 7 2022-04-21T07:08:28 0 Block ROGERS AV 360290037001 LARCENY/THEFT 22-1110134 37 2022-04-21T07:08:28 LOVEJOY POINT (-78.80400 42.91800)
1 Buffalo Kensington-Bailey District E 42.939 Theft NY 360290044014004 Thursday 004401 Buffalo Police are investigating this report o... 36029004401 4004 -78.8 4004 4 4 44.01 23 2022-04-21T23:28:41 400 Block EGGERT RD 360290044014 LARCENY/THEFT 22-1110924 44.01 2022-04-21T17:30:41 UNIVERSITY POINT (-78.80000 42.93900)
2 Buffalo North Park District D 42.952 Theft NY 360290050003005 Friday 005000 Buffalo Police are investigating this report o... 36029005000 3005 -78.876 3010 3 3 50 15 2022-04-22T15:18:07 1900 Block ELMWOOD AV 360290050003 LARCENY/THEFT 22-1120542 50 2022-04-22T15:18:07 NORTH POINT (-78.87600 42.95200)
3 Buffalo First Ward District A 42.868 Theft NY 360290005001000 Friday 000500 Buffalo Police are investigating this report o... 36029000500 1000 -78.85 1000 1 1 5 22 2022-04-22T22:25:34 800 Block S PARK AV 360290005001 LARCENY/THEFT 22-1120960 5 2022-04-22T22:00:34 FILLMORE POINT (-78.85000 42.86800)
4 Buffalo Ellicott District A 42.874 Theft NY 360290164001003 Sunday 016400 Buffalo Police are investigating this report o... 36029016400 1003 -78.871 1011 1 1 164 2 2022-04-24T02:01:30 0 Block FULTON ST 360290164001 LARCENY/THEFT 22-1140104 164 2022-04-24T00:58:30 FILLMORE POINT (-78.87100 42.87400)

Check the Coordinate Reference System (CRS)

A CRS defines how the two-dimensional, projected map in a Geographic Information System (GIS) relates to real places on the earth.

Check the CRS and change it to EPSG:3857 (Web Mercator), the projection used by the contextily basemap tiles, so that the points line up with the basemap when plotting.
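
To make the conversion less opaque, here is a minimal sketch of the spherical Web Mercator formula behind EPSG:3857, written with the standard library only. The function name `lonlat_to_web_mercator` is made up for this illustration; in the notebook the conversion itself is done by `to_crs`. The sample point is the first record in the dataset (longitude -78.804, latitude 42.918), whose geometry appears as POINT (-8772421.152 5299498.928) after the CRS change.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS 84 semi-major axis, used by EPSG:3857


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Project WGS 84 longitude/latitude (degrees) to EPSG:3857 metres."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


# First record of the dataset: lon -78.804, lat 42.918
x, y = lonlat_to_web_mercator(-78.804, 42.918)
print(round(x, 3), round(y, 3))
```

The units change from degrees to metres, which is why the geometry column shows seven-digit values after the conversion.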

In [204]:
# Check crs
crime_gdf.crs
Out[204]:
<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World.
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984 ensemble
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich
In [205]:
# Change crs
crime_gdf.to_crs('epsg:3857',inplace=True)

Check and drop missing geometry rows

In [206]:
crime_gdf.shape
Out[206]:
(281602, 27)
In [207]:
orig_rows = crime_gdf.shape[0]
crime_gdf = crime_gdf.loc[crime_gdf.geometry.notnull()]
print(f'Records with missing location information = {orig_rows-crime_gdf.shape[0]}')
Records with missing location information = 3959
In [208]:
#crime_gdf.geometry=crime_gdf.geometry.astype(float)
crime_gdf.dropna(subset =['geometry'], how='any',inplace=True)
#crime_gdf.dropna( how='any',inplace=True)
crime_gdf.shape
Out[208]:
(277643, 27)

The 3959 records with missing location information are deleted because they cannot be shown on the map.

Mapping crimes by neighborhoods

In [209]:
fig, ax = plt.subplots(figsize=(15,15), subplot_kw=dict(aspect='equal'))
crime_gdf.plot(column=crime_gdf['neighborhood_1'], ax=ax);
ax.set_title('Crime Incident Locations of Buffalo by Neighborhood',fontdict={'fontsize': '25', 'fontweight' : '3'})
ax.set_axis_off()
ctx.add_basemap(ax)

Some records have bad geometry data: their locations fall outside New York State.

As a result, the map extends far beyond the Buffalo area.

In [210]:
#crime_gdf = crime_gdf.GeoDataFrame.drop(columns=['incident_id'],  axis=1, inplace=True)
crime_gdf.council_district.unique()
Out[210]:
array(['LOVEJOY', 'UNIVERSITY', 'NORTH', 'FILLMORE', 'ELLICOTT', 'MASTEN',
       'NIAGARA', 'SOUTH', 'DELAWARE', 'UNKNOWN'], dtype=object)

The dataset contains an 'UNKNOWN' council district, which causes the mapping problem above. To fix it, remove the records whose council_district is UNKNOWN.
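
Dropping these rows does not require moving `council_district` into the index. As a hedged alternative, the same filter can be written with a boolean mask. The sketch below uses a made-up frame: the `toy` data and the `case_number` values are hypothetical, and only the column names come from the dataset.

```python
import pandas as pd

# Toy data for illustration only; the values are hypothetical.
toy = pd.DataFrame({
    "council_district": ["NORTH", "UNKNOWN", "SOUTH", "UNKNOWN"],
    "case_number": ["c1", "c2", "c3", "c4"],
})

# Keep every row whose council district is known. The .copy() avoids
# chained-assignment warnings if columns are added to the result later.
known = toy.loc[toy["council_district"] != "UNKNOWN"].copy()

print(known["council_district"].tolist())
```

With this form the index is left untouched, so no later `reset_index` is needed.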

In [211]:
# set council_district as index of the dataframe
crime_gdf.set_index('council_district',inplace=True)
crime_gdf.head()
Out[211]:
city neighborhood_1 police_district latitude parent_incident_type state geoid20_block day_of_week tractce20 incident_description geoid20_tract census_block longitude census_block_2010 census_block_group census_block_group_2010 census_tract hour_of_day created_at address_1 geoid20_blockgroup incident_type_primary case_number census_tract_2010 incident_datetime geometry
council_district
LOVEJOY Buffalo Schiller Park District E 42.918 Theft NY 360290037001005 Thursday 003700 Buffalo Police are investigating this report o... 36029003700 1005 -78.804 1005 1 1 37 7 2022-04-21T07:08:28 0 Block ROGERS AV 360290037001 LARCENY/THEFT 22-1110134 37 2022-04-21T07:08:28 POINT (-8772421.152 5299498.928)
UNIVERSITY Buffalo Kensington-Bailey District E 42.939 Theft NY 360290044014004 Thursday 004401 Buffalo Police are investigating this report o... 36029004401 4004 -78.8 4004 4 4 44.01 23 2022-04-21T23:28:41 400 Block EGGERT RD 360290044014 LARCENY/THEFT 22-1110924 44.01 2022-04-21T17:30:41 POINT (-8771975.875 5302691.629)
NORTH Buffalo North Park District D 42.952 Theft NY 360290050003005 Friday 005000 Buffalo Police are investigating this report o... 36029005000 3005 -78.876 3010 3 3 50 15 2022-04-22T15:18:07 1900 Block ELMWOOD AV 360290050003 LARCENY/THEFT 22-1120542 50 2022-04-22T15:18:07 POINT (-8780436.156 5304668.609)
FILLMORE Buffalo First Ward District A 42.868 Theft NY 360290005001000 Friday 000500 Buffalo Police are investigating this report o... 36029000500 1000 -78.85 1000 1 1 5 22 2022-04-22T22:25:34 800 Block S PARK AV 360290005001 LARCENY/THEFT 22-1120960 5 2022-04-22T22:00:34 POINT (-8777541.849 5291901.635)
FILLMORE Buffalo Ellicott District A 42.874 Theft NY 360290164001003 Sunday 016400 Buffalo Police are investigating this report o... 36029016400 1003 -78.871 1011 1 1 164 2 2022-04-24T02:01:30 0 Block FULTON ST 360290164001 LARCENY/THEFT 22-1140104 164 2022-04-24T00:58:30 POINT (-8779879.558 5292812.985)
In [212]:
crime_gdf.drop(['UNKNOWN'] , axis=0,inplace=True)
In [213]:
fig, ax = plt.subplots(figsize=(15,15), subplot_kw=dict(aspect='equal'))
crime_gdf.plot(column=crime_gdf['neighborhood_1'], ax=ax);
ax.set_title('Crime Incident Locations of Buffalo by Neighborhood',fontdict={'fontsize': '25', 'fontweight' : '3'})
ax.set_axis_off()
ctx.add_basemap(ax)

After deleting the UNKNOWN council district records, the map now shows all cases within the Buffalo area. Based on the map, almost every place in Buffalo has a record of crime incidents. Only the South and Delaware Council Districts show some blank areas with no crime records. These areas are parks.

Mapping by crime types

In [214]:
crime_gdf.parent_incident_type.unique()
Out[214]:
array(['Theft', 'Theft of Vehicle', 'Sexual Offense',
       'Breaking & Entering', 'Assault', 'Robbery', 'Homicide',
       'Other Sexual Offense', 'Sexual Assault'], dtype=object)

There are 9 crime types in the dataset. This part draws a map that contrasts 2 of them: Assault and Homicide.

In [215]:
crime_gdf.reset_index(inplace=True)
In [216]:
crime_gdf['conrank'] = 'lightgray'
#crime_gdf.loc[crime_gdf.parent_incident_type == 'Theft','conrank']='red'
crime_gdf.loc[crime_gdf.parent_incident_type == 'Assault','conrank']='blue'
#crime_gdf.loc[crime_gdf.parent_incident_type == 'Robbery','conrank']='purple'
#crime_gdf.loc[crime_gdf.parent_incident_type == 'Theft of Vehicle','conrank']='orange'
#crime_gdf.loc[crime_gdf.parent_incident_type == 'Breaking & Entering','conrank']='yellow'
#crime_gdf.loc[crime_gdf.parent_incident_type == 'Sexual Offense','conrank']='violet'
#crime_gdf.loc[crime_gdf.parent_incident_type == 'Other Sexual Offense','conrank']='brown'
#crime_gdf.loc[crime_gdf.parent_incident_type == 'Sexual Assault','conrank']='lime'
crime_gdf.loc[crime_gdf.parent_incident_type == 'Homicide','conrank']='deepPink'
crime_gdf.loc[~crime_gdf.parent_incident_type.isin(['Homicide','Assault']),'conrank']='gray'
In [217]:
import matplotlib.lines as mlines

fig, ax = plt.subplots(figsize=(12,12), subplot_kw=dict(aspect='equal'))

deepPink_marker = mlines.Line2D([], [], color='deepPink', marker='.', linestyle='None',
                          markersize=10,label='Homicide')
blue_marker = mlines.Line2D([], [], color='blue', marker='.', linestyle='None',
                          markersize=10,label='Assault')
gray_marker=mlines.Line2D([], [], color='gray', marker='.', linestyle='None',
                          markersize=10, label='Other types')
ax.legend(handles=[deepPink_marker,blue_marker,gray_marker])

crime_gdf.plot(color=crime_gdf['conrank'], ax=ax)
ax.set_title('Buffalo Assault and Homicide Crime Cases',fontdict={'fontsize': '25', 'fontweight' : '3'})
ax.set_axis_off()
ctx.add_basemap(ax)

This map shows how Assault and Homicide cases differ in location and number. Assault is the second most common type of crime in Buffalo, and Assault incidents occurred far more often than Homicide incidents.

Although the number of cases differs, both Assault and Homicide cases are scattered all around Buffalo.

Mapping Duplicated locations

Many locations have more than 1 recorded crime case. This part shows the duplicated addresses in the dataset.

In [218]:
# The total number of duplicated addresses is smaller than above because rows with missing geometry were removed
crime_gdf.duplicated(subset=['address_1'],keep='first').sum()
Out[218]:
256743
In [219]:
fig, ax = plt.subplots(figsize=(15,15), subplot_kw=dict(aspect='equal'))
crime_gdf.plot(column=crime_gdf.duplicated(subset=['address_1'],keep='first'), ax=ax);
ax.set_title('>= 2 Crime Incidents Cases Locations of Buffalo',fontdict={'fontsize': '25', 'fontweight' : '3'})
ax.set_axis_off()
ctx.add_basemap(ax)

Yellow dots mark records whose address already appeared earlier in the dataset, so the address has at least 2 crime cases. Dark dots mark the first record at each address, which includes addresses with only 1 crime case. There is a clear difference in quantity between the two categories: more than 92% of records repeat an earlier address.
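
How a record is flagged depends on the `keep` argument of `duplicated`. The following minimal sketch, on a made-up table (the `toy` frame is hypothetical; `address_1` is the dataset column name), contrasts `keep='first'`, which is used for the map above, with `keep=False`, which flags every record at an address that has two or more records.

```python
import pandas as pd

# Toy data for illustration only; the values are hypothetical.
toy = pd.DataFrame({"address_1": ["A ST", "B AV", "A ST", "A ST", "C RD"]})

# keep='first': the first record at each address is NOT flagged,
# even when the address occurs again later.
repeats_only = toy.duplicated(subset=["address_1"], keep="first")

# keep=False: every record at an address with 2+ records is flagged.
all_at_repeat_addresses = toy.duplicated(subset=["address_1"], keep=False)

print(repeats_only.tolist())
print(all_at_repeat_addresses.tolist())
```

Coloring the map by the `keep=False` result would separate single-incident addresses from all others, at the cost of no longer distinguishing first records from repeats.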

Theft and Homicide Point Frequency Maps

Point locations represent where the actual event occurred. This approach is only viable if there are point locations with multiple occurrences of the geographic event under consideration.
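
A point frequency map usually sizes each marker by the number of events at its location. The sketch below illustrates two scaling options with the standard library only. The helper `scaled_radii` and its parameters are hypothetical names for this illustration; the square-root option is a common cartographic alternative and is not what this notebook uses.

```python
import math


def scaled_radii(counts, max_radius=60, area_proportional=False):
    """Scale event counts to marker radii between 0 and max_radius."""
    max_count = max(counts)
    radii = []
    for count in counts:
        share = count / max_count
        if area_proportional:
            # Square-root scaling keeps the marker AREA proportional to the count.
            share = math.sqrt(share)
        radii.append(round(share * max_radius))
    return radii


print(scaled_radii([10, 5, 1]))
print(scaled_radii([10, 5, 1], area_proportional=True))
```

Linear scaling makes small counts nearly invisible next to a large maximum, while square-root scaling compresses the range so that low-frequency locations stay readable.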

Bokeh

In [220]:
from bokeh.tile_providers import CARTODBPOSITRON, get_provider
tileProvider = get_provider('CARTODBPOSITRON_RETINA')

from bokeh.io import output_notebook, show, output_file, save
from bokeh.plotting import figure
from bokeh.models import HoverTool, GeoJSONDataSource
from bokeh.layouts import row,column
from bokeh.models.widgets import Div

output_notebook()

TOOLS = "pan,wheel_zoom,box_zoom,reset,save"
In [221]:
kwargs = {"plot_width":800,
          "plot_height":700,
          "sizing_mode":'scale_both',
          "outline_line_color":'#046626',
          "outline_line_width":3,
          "outline_line_alpha":.3,
          'toolbar_location':'above',
          'border_fill_color':'#4287f5',
          'border_fill_alpha':.3,
          'min_border_left': 20,
          'min_border_right':20,
          'min_border_top': 10,
          'min_border_bottom':20}
In [222]:
# Check null geometry 
orig_rows = crime_gdf.shape[0] 
crime_gdf = crime_gdf.loc[crime_gdf.geometry.notnull()]
print(f'Records with missing location information = {orig_rows-crime_gdf.shape[0]:,.0f}\n\
Percent missing = {((orig_rows-crime_gdf.shape[0])/orig_rows)*100:,.0f}%')
Records with missing location information = 0
Percent missing = 0%

Create a unique key

This key combines the projected x and y coordinates (longitude and latitude) of each incident, so that incidents at the same location share the same key.
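
Concatenating the two coordinate strings without a separator works here, but the resulting key is hard to read and could in principle merge two different coordinate pairs. As a hedged alternative, records can be counted per coordinate pair directly. The sketch below uses a made-up frame: the `toy` data and the `x`/`y` column names are assumptions for this illustration.

```python
import pandas as pd

# Toy projected coordinates for illustration only; the values are hypothetical.
toy = pd.DataFrame({
    "x": [-8773756.99, -8773756.99, -8775204.14, -8773756.99],
    "y": [5302843.69, 5302843.69, 5299498.93, 5302843.69],
})

# Count records per (x, y) pair. The pair itself is the key, so no
# string concatenation is needed.
counts = (
    toy.groupby(["x", "y"])
       .size()
       .rename("counts")
       .reset_index()
       .sort_values("counts", ascending=False)
)

print(counts)
```

On the real data, `x` and `y` could be taken from `crime_gdf.geometry.x` and `crime_gdf.geometry.y`.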

In [223]:
crime_gdf['newLoc'] = crime_gdf.geometry.x.astype(str)+ crime_gdf.geometry.y.astype(str)
In [224]:
numlocs = crime_gdf.newLoc.value_counts().rename_axis('uniquepts').to_frame('counts')
numlocs.head()
Out[224]:
counts
uniquepts
-8773756.9863626515302843.689697811 1033
-8775204.1397429615299498.92780969 1018
-8780658.7947918335304972.796811193 924
-8780213.5168286585305124.894422529 795
-8781104.0727550075295243.684788868 783

At some locations, crime incidents occurred at a very high rate. For example, the single location -8773756.9863626515302843.689697811 accounts for 1033 crime cases!

In [225]:
crime_gdf.geometry.value_counts().sum()
Out[225]:
277169
In [226]:
# Remove duplicate
uHl = crime_gdf.drop_duplicates(subset='newLoc').reset_index()
uHl.tail()
Out[226]:
index council_district city neighborhood_1 police_district latitude parent_incident_type state geoid20_block day_of_week tractce20 incident_description geoid20_tract census_block longitude census_block_2010 census_block_group census_block_group_2010 census_tract hour_of_day created_at address_1 geoid20_blockgroup incident_type_primary case_number census_tract_2010 incident_datetime geometry conrank newLoc
8540 274958 NORTH Buffalo Elmwood Bidwell District D 42.934 Theft of Vehicle NY 360290035021000 Friday 980000 Buffalo Police are investigating this report o... 36029980000 1000 -78.88 1000 1 1 9800 21 2014-12-06T07:05:00 E DIST 360290001101 UUV 14-3390890 62.01 2014-12-05T21:55:00 POINT (-8780881.434 5301931.363) gray -8780881.4337734195301931.3634219635
8541 276157 DELAWARE Buffalo North Park District D 42.959 Robbery NY 360290035021006 Sunday 005100 Buffalo Police are investigating this report o... 36029005100 1006 -78.851 1006 1 1 51 23 2010-11-02T06:01:00 900 Block KENMORE AV 360290001101 ROBBERY 10-3041077 51 2010-10-31T23:45:00 POINT (-8777653.169 5305733.310) gray -8777653.1685404145305733.309584979
8542 276386 ELLICOTT Buffalo Central District B 42.884 Theft NY 360290025021007 Monday 007102 Buffalo Police are investigating this report o... 36029007102 1007 -78.883 1011 1 1 71.02 23 2013-11-28T07:02:00 100 Block CHURCH ST 360290001101 LARCENY/THEFT 13-3220987 71.02 2013-11-18T23:00:00 POINT (-8781215.392 5294332.098) gray -8781215.39224585294332.098356516
8543 276701 NORTH Buffalo Black Rock District D 42.946 Breaking & Entering NY 360290035021003 Saturday 005700 Buffalo Police are investigating this report o... 36029005700 1003 -78.897 1009 1 1 57 23 2019-09-24T22:47:00 1 Block RIVER ROCK DR 360290001101 BURGLARY 06-2871431 57 2006-10-14T23:01:00 POINT (-8782773.865 5303756.105) gray -8782773.8651169055303756.104882298
8544 277013 FILLMORE Buffalo First Ward District A 42.868 Theft NY 360290165001019 Thursday 016400 Buffalo Police are investigating this report o... 36029016400 1019 -78.869 1025 1 1 164 10 2015-05-09T06:06:00 300 Block OHIO ST 360290001101 LARCENY/THEFT 15-1280373 164 2015-05-07T10:53:00 POINT (-8779656.919 5291901.635) gray -8779656.9193746935291901.634548973
In [227]:
uHl.parent_incident_type.unique()
Out[227]:
array(['Theft', 'Theft of Vehicle', 'Sexual Offense',
       'Breaking & Entering', 'Assault', 'Robbery', 'Homicide',
       'Other Sexual Offense', 'Sexual Assault'], dtype=object)
In [228]:
allHl = pd.merge(uHl,numlocs,left_on='newLoc',right_on='uniquepts').drop(['newLoc'],axis=1)
print(f'Number of locations: {allHl.shape[0]}\n\
accounting for {allHl.counts.sum()} cases of crime incidents in Buffalo')
Number of locations: 8545
accounting for 277169 cases of crime incidents in Buffalo

Map

This part looks at the locations of theft cases, the most frequent crime type, and homicide cases, the most dangerous crime type.

In [229]:
# Theft cases 
theftcases = allHl.loc[allHl.parent_incident_type == 'Theft'].copy()
print(f'Number of Locations Theft cases: {theftcases.shape[0]}\n\
Accounting for {allHl.counts.sum()} cases of crime incidents in Buffalo')
Number of Locations Theft cases: 3748
Accounting for 277169 cases of crime incidents in Buffalo
In [230]:
# Homicide cases 
homicidecases = allHl.loc[allHl.parent_incident_type == 'Homicide'].copy()
print(f'Number of Locations for Homicide cases: {homicidecases.shape[0]}\n\
Accounting for {allHl.counts.sum()} cases of crime incidents in Buffalo')
Number of Locations for Homicide cases: 38
Accounting for 277169 cases of crime incidents in Buffalo
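
One caveat about these filters: `uHl` keeps only the first record at each location, so `parent_incident_type` in `allHl` is the type of that first record, while `counts` covers every crime type at the location. If per-type frequencies are wanted, one option is to count by location and type together. The sketch below is a hedged illustration on made-up rows: the `toy` frame is hypothetical, and `newLoc` and `parent_incident_type` are the column names used above.

```python
import pandas as pd

# Toy data for illustration only; the values are hypothetical.
toy = pd.DataFrame({
    "newLoc": ["p1", "p1", "p1", "p2", "p2", "p3"],
    "parent_incident_type": ["Theft", "Assault", "Theft", "Homicide", "Theft", "Theft"],
})

# Count records per (location, crime type) pair.
type_counts = (
    toy.groupby(["newLoc", "parent_incident_type"])
       .size()
       .rename("counts")
       .reset_index()
)

# Theft frequency at each location, independent of record order.
theft_by_loc = type_counts.loc[type_counts["parent_incident_type"] == "Theft"]

print(theft_by_loc)
```

With counts built this way, the sum of `counts` over the theft rows equals the number of theft records, which could then be reported in the print statement instead of the overall total.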
In [231]:
maxcir = 60
maxcnt = theftcases.counts.max()
theftcases['radius']=(theftcases.counts/maxcnt*maxcir)
theftcases['radius']=theftcases['radius'].astype(float).round().astype(int)
theftcases.head()
Out[231]:
index council_district city neighborhood_1 police_district latitude parent_incident_type state geoid20_block day_of_week tractce20 incident_description geoid20_tract census_block longitude census_block_2010 census_block_group census_block_group_2010 census_tract hour_of_day created_at address_1 geoid20_blockgroup incident_type_primary case_number census_tract_2010 incident_datetime geometry conrank counts radius
0 0 LOVEJOY Buffalo Schiller Park District E 42.918 Theft NY 360290037001005 Thursday 003700 Buffalo Police are investigating this report o... 36029003700 1005 -78.804 1005 1 1 37 7 2022-04-21T07:08:28 0 Block ROGERS AV 360290037001 LARCENY/THEFT 22-1110134 37 2022-04-21T07:08:28 POINT (-8772421.152 5299498.928) gray 66 4
1 1 UNIVERSITY Buffalo Kensington-Bailey District E 42.939 Theft NY 360290044014004 Thursday 004401 Buffalo Police are investigating this report o... 36029004401 4004 -78.8 4004 4 4 44.01 23 2022-04-21T23:28:41 400 Block EGGERT RD 360290044014 LARCENY/THEFT 22-1110924 44.01 2022-04-21T17:30:41 POINT (-8771975.875 5302691.629) gray 36 2
2 2 NORTH Buffalo North Park District D 42.952 Theft NY 360290050003005 Friday 005000 Buffalo Police are investigating this report o... 36029005000 3005 -78.876 3010 3 3 50 15 2022-04-22T15:18:07 1900 Block ELMWOOD AV 360290050003 LARCENY/THEFT 22-1120542 50 2022-04-22T15:18:07 POINT (-8780436.156 5304668.609) gray 77 4
3 3 FILLMORE Buffalo First Ward District A 42.868 Theft NY 360290005001000 Friday 000500 Buffalo Police are investigating this report o... 36029000500 1000 -78.85 1000 1 1 5 22 2022-04-22T22:25:34 800 Block S PARK AV 360290005001 LARCENY/THEFT 22-1120960 5 2022-04-22T22:00:34 POINT (-8777541.849 5291901.635) gray 25 1
4 4 FILLMORE Buffalo Ellicott District A 42.874 Theft NY 360290164001003 Sunday 016400 Buffalo Police are investigating this report o... 36029016400 1003 -78.871 1011 1 1 164 2 2022-04-24T02:01:30 0 Block FULTON ST 360290164001 LARCENY/THEFT 22-1140104 164 2022-04-24T00:58:30 POINT (-8779879.558 5292812.985) gray 529 31
In [232]:
maxcir = 60
maxcnt = homicidecases.counts.max()
homicidecases['radius']=(homicidecases.counts/maxcnt*maxcir)
homicidecases['radius']=homicidecases['radius'].astype(float).round().astype(int)
homicidecases.head()
Out[232]:
index council_district city neighborhood_1 police_district latitude parent_incident_type state geoid20_block day_of_week tractce20 incident_description geoid20_tract census_block longitude census_block_2010 census_block_group census_block_group_2010 census_tract hour_of_day created_at address_1 geoid20_blockgroup incident_type_primary case_number census_tract_2010 incident_datetime geometry conrank counts radius
106 107 ELLICOTT Buffalo Masten Park District C 42.905 Homicide NY 360290033024003 Sunday 003302 Buffalo Police are investigating this report o... 36029003302 4003 -78.851 4003 4 4 33.02 15 2022-04-24T15:50:11 400 Block DODGE ST 360290033024 MURDER 22-1140684 33.02 2022-04-24T15:50:11 POINT (-8777653.169 5297523.039) deepPink 22 7
144 151 UNIVERSITY Buffalo Kensington-Bailey District E 42.934 Homicide NY 360290044012008 Saturday 004401 Buffalo Police are investigating this report o... 36029004401 2008 -78.81 2008 2 2 44.01 19 2022-04-30T19:43:13 200 Block MARTHA AV 360290044012 MURDER 22-1200845 44.01 2022-04-30T19:43:13 POINT (-8773089.069 5301931.363) deepPink 70 22
260 273 LOVEJOY Buffalo Schiller Park District E 42.923 Homicide NY 360290037001001 Friday 003700 Buffalo Police are investigating this report o... 36029003700 1001 -78.807 1001 1 1 37 22 2022-05-06T22:09:55 1300 Block E DELAVAN AV 360290037001 MURDER 22-1261042 37 2022-05-06T22:09:55 POINT (-8772755.111 5300258.996) deepPink 179 55
283 299 MASTEN Buffalo MLK Park District C 42.914 Homicide NY 360290035012004 Wednesday 003501 Buffalo Police are investigating this report o... 36029003501 2004 -78.828 3003 2 3 35.01 1 2022-05-04T01:00:15 0 Block DONOVAN DR 360290035012 MURDER 22-1240032 35 2022-05-04T01:00:15 POINT (-8775092.820 5298890.918) deepPink 195 60
294 312 FILLMORE Buffalo Lovejoy District C 42.895 Homicide NY 360290024005007 Wednesday 002400 Buffalo Police are investigating this report o... 36029002400 5007 -78.825 2006 5 2 24 21 2022-05-04T21:25:20 100 Block PERSON ST 360290024005 MURDER 22-1240918 24 2022-05-04T21:25:20 POINT (-8774758.862 5296003.408) deepPink 59 18
In [233]:
theftcases.to_crs('epsg:3857',inplace=True)
homicidecases.to_crs('epsg:3857',inplace=True)
output_file("/content/CrimePointFrequencyMaps.html",
            title="Locations with Frequency Crime Incidents in Buffalo")

f1 = figure(title = "Location of Theft cases in Buffalo", tools=TOOLS, toolbar_sticky=False,**kwargs)
f2 = figure(title = "Location of Homicide cases in Buffalo", tools=TOOLS, toolbar_sticky=False,**kwargs,
            x_range=f1.x_range,y_range=f1.y_range)

f1.add_tile(tileProvider)
f1.title.text_font_style = 'italic'
f1.title.text_font_size = '14pt'
f1.axis.visible=False 

f2.add_tile(tileProvider)
f2.title.text_font_style = 'italic'
f2.title.text_font_size = '14pt'
f2.axis.visible=False 

point_source_1 = GeoJSONDataSource(geojson=theftcases.to_json())
point_source_2 = GeoJSONDataSource(geojson=homicidecases.to_json())


Circle1=f1.circle('x','y',size='radius',fill_color='blue',line_color='blue',fill_alpha=0.5,source=point_source_1)
Circle2=f2.circle('x','y',size='radius',fill_color='red',line_color='red',fill_alpha=0.5,source=point_source_2)


c_hover= HoverTool(renderers=[Circle1])
c_hover.point_policy = "follow_mouse"
c_hover.tooltips=[("Address","@address_1," "@council_district"),
                  ("   " , "    "),
                  ("Number of Cases","@counts")]

f1.add_tools(c_hover)

c2_hover= HoverTool(renderers=[Circle2])
c2_hover.point_policy = "follow_mouse"
c2_hover.tooltips=[("Address","@address_1," "@council_district"),
                  ("   " , "    "),
                  ("Number of Cases","@counts")]

f2.add_tools(c2_hover)

heading = Div(text="""<h1>Point Frequency Maps</h1>\
<p>The two maps below show the locations and frequencies of theft and homicide cases in Buffalo. \
On the left, proportional point symbols show the locations of theft cases; on the right, the locations of homicide cases.</p>\
<p>Use the tools to the right of each map to pan, zoom, and so on. \
Hover over a point to see its address and number of cases.</p>\
<p><b><i>Data Source</i></b>: <a href='https://data.buffalony.gov/' target='_blank'>Buffalo Open Data</a>.</p>\
<p style="font-size:9px;">Maps created 4/10/2022 by Nguyet Que T. Tran.</p>""", sizing_mode="stretch_both")

layout = column(heading, row(f1,f2),sizing_mode='stretch_both',margin=(5,5,5,5))
show(layout)

The maps show the locations of crime incidents. Each point is geocoded to the location of an address, house, or store.

The size of the symbol at each point represents the number of crimes recorded at that location: the more cases, the larger the circle.
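
The sizing rule used in the cells above (count divided by the largest count, times a maximum size of 60, then rounded) can be checked with a small dependency-free sketch. The function name `scale_sizes` and the `min_size` parameter are assumptions added for illustration; the notebook itself does this with pandas columns and has no minimum size. The sample counts are the five homicide rows shown in Out[232].

```python
def scale_sizes(counts, max_size=60, min_size=0):
    """Linearly scale counts so that the largest count maps to max_size."""
    largest = max(counts)
    return [max(min_size, round(c / largest * max_size)) for c in counts]

# Counts from the five homicide rows shown above (195 is the largest)
homicide_counts = [22, 70, 179, 195, 59]
print(scale_sizes(homicide_counts))

# A location with a single case rounds down to size 0 and disappears,
# unless a minimum size (hypothetical parameter) is enforced.
print(scale_sizes([1, 195]))
print(scale_sizes([1, 195], min_size=2))
```

With linear scaling, locations whose count is very small relative to the maximum get a size of 0 after rounding, so they are not drawn. A minimum size is one possible way to keep them visible.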

Which council district is the safest place to live?

Point Distribution Map

In [234]:
# Buffalo Council Districts dataset
api_url="https://data.buffalony.gov/resource/u5mx-ugvy.geojson"
cd_gdf=gpd.read_file(api_url)
cd_gdf.tail()
Out[234]:
dist_id dist_name shape_leng objectid_1 geometry
5 9 NIAGARA 0.14931438999999999 9 MULTIPOLYGON (((-78.89588 42.92591, -78.89457 ...
6 7 UNIVERSITY 0.18683203000000001 8 MULTIPOLYGON (((-78.80780 42.95894, -78.80774 ...
7 8 MASTEN 0.20357618 10 MULTIPOLYGON (((-78.82813 42.94033, -78.82812 ...
8 4 SOUTH 0.51072598000000002 5 MULTIPOLYGON (((-78.88394 42.87750, -78.88360 ...
9 2 NORTH 0.20635665 3 MULTIPOLYGON (((-78.89236 42.96107, -78.89061 ...
In [235]:
crime_gdf = crime_gdf.to_crs('epsg:3857')
cd_gdf = cd_gdf.to_crs('epsg:3857')
In [236]:
joindf = gpd.sjoin(crime_gdf,cd_gdf,how='inner',predicate='intersects')
In [237]:
joindf.tail()
Out[237]:
council_district city neighborhood_1 police_district latitude parent_incident_type state geoid20_block day_of_week tractce20 incident_description geoid20_tract census_block longitude census_block_2010 census_block_group census_block_group_2010 census_tract hour_of_day created_at address_1 geoid20_blockgroup incident_type_primary case_number census_tract_2010 incident_datetime geometry conrank newLoc index_right dist_id dist_name shape_leng objectid_1
262997 SOUTH Buffalo Seneca-Cazenovia District A 42.863 Breaking & Entering NY 360290035021003 Friday 001100 Buffalo Police are investigating this report o... 36029001100 1003 -78.811 1008 1 1 11 10 2011-08-10T06:01:00 100 Block PAWNEE PW 360290001101 BURGLARY 11-2210517 11 2011-08-05T10:00:00 POINT (-8773200.389 5291142.244) gray -8773200.3889086845291142.243814516 2 0 UnAssigned 0.47838895999999997 1
264079 SOUTH Buffalo South Park District A 42.853 Theft NY 360290163002006 Saturday 000900 Buffalo Police are investigating this report o... 36029000900 2006 -78.812 2006 2 2 9 23 2011-07-11T06:00:00 400 Block S LEGION DR 360290001102 LARCENY/THEFT 11-1910354 9 2011-07-09T23:45:00 POINT (-8773311.708 5289623.647) gray -8773311.7083994765289623.646827109 2 0 UnAssigned 0.47838895999999997 1
264565 FILLMORE Buffalo Central District B 42.878 Theft NY 360290036001010 Friday 007202 Buffalo Police are investigating this report o... 36029007202 1010 -78.882 1012 1 1 72.02 12 2013-08-25T06:01:00 1 Block MARINE DR 360290001101 LARCENY/THEFT 13-2360399 72.02 2013-08-23T12:00:00 POINT (-8781104.073 5293420.601) gray -8781104.0727550075293420.600578126 2 0 UnAssigned 0.47838895999999997 1
266415 FILLMORE Buffalo Central District B 42.881 Theft NY 360290036001010 Saturday 007202 Buffalo Police are investigating this report o... 36029007202 1010 -78.889 1012 1 1 72.02 11 2019-09-24T22:42:00 1 Block ERIE BASIN MARINA ST 360290001101 LARCENY/THEFT 07-2020586 72.02 2007-07-21T11:50:00 POINT (-8781883.309 5293876.338) gray -8781883.3091905585293876.338387279 2 0 UnAssigned 0.47838895999999997 1
269317 SOUTH Buffalo Seneca-Cazenovia District A 42.859 Theft NY 360290033022001 Saturday 001100 Buffalo Police are investigating this report o... 36029001100 2001 -78.821 2001 2 2 11 1 2010-11-07T06:01:00 1 Block REMINGTON PL 360290001102 LARCENY/THEFT 10-3100287 11 2010-11-06T01:30:00 POINT (-8774313.584 5290534.776) gray -8774313.5838166165290534.775507046 2 0 UnAssigned 0.47838895999999997 1
In [238]:
# Count incidents per council district, largest first
joindf['council_district']=joindf.council_district.astype(str)
ct = joindf.groupby('council_district').size().sort_values(ascending=False)
ctdf=ct.to_frame(name='counts').reset_index()
In [239]:
ctdf.tail()
Out[239]:
council_district counts
4 FILLMORE 31575
5 MASTEN 31215
6 NIAGARA 30567
7 DELAWARE 18578
8 SOUTH 17308
In [240]:
nCases = pd.merge(cd_gdf,ctdf,left_on="dist_name",right_on="council_district")
nCases['centroids'] =nCases['geometry'].centroid
nCases = nCases.set_geometry('centroids')
In [241]:
maxcir = 60
maxcnt = nCases.counts.max()
nCases['radius']=(nCases.counts/maxcnt*maxcir)
nCases['radius']=nCases['radius'].astype(float).round().astype(int)
nCases.head()
Out[241]:
dist_id dist_name shape_leng objectid_1 geometry council_district counts centroids radius
0 5 DELAWARE 0.19087878 6 MULTIPOLYGON (((-8778099.466 5305679.993, -877... DELAWARE 18578 POINT (-8778543.920 5302638.853) 25
1 3 FILLMORE 0.42294144 4 MULTIPOLYGON (((-8773501.858 5299351.099, -877... FILLMORE 31575 POINT (-8776864.657 5294510.894) 42
2 1 ELLICOTT 0.32953199 2 MULTIPOLYGON (((-8778908.981 5299381.803, -877... ELLICOTT 45497 POINT (-8779145.197 5296280.698) 60
3 6 LOVEJOY 0.35163747000000001 7 MULTIPOLYGON (((-8773498.518 5300355.298, -877... LOVEJOY 32587 POINT (-8773355.009 5294709.224) 43
4 9 NIAGARA 0.14931438999999999 9 MULTIPOLYGON (((-8782649.206 5300701.761, -878... NIAGARA 30567 POINT (-8781916.166 5298678.305) 40
In [242]:
output_file("/content/CrimeDistributionMaps.html",
            title="Crime Incidents by Council Districts in Buffalo")
f1 = figure(title = "Crime incident cases in Buffalo by Council Districts", tools=TOOLS, toolbar_sticky=False,**kwargs)

f1.add_tile(tileProvider)
f1.title.text_font_style = 'italic'
f1.title.text_font_size = '14pt'
f1.axis.visible=False 

TA20 = nCases.drop('geometry',axis=1).copy()


point_source_1 = GeoJSONDataSource(geojson=TA20.to_json())
poly_source = GeoJSONDataSource(geojson=cd_gdf.to_json())

Circle1=f1.circle('x','y',size='radius',fill_color='blue',line_color='blue',fill_alpha=0.5,source=point_source_1)
areas = f1.patches('xs','ys',source=poly_source,name="Council Districts",fill_color=None,fill_alpha=0.6,line_color="black",line_width=0.5)

c_hover= HoverTool(renderers=[Circle1])
c_hover.point_policy = "follow_mouse"
c_hover.tooltips=[
                  ("Council Districts","@dist_name"),
                  ("Number of Cases","@counts")]

f1.add_tools(c_hover)



heading = Div(text="""<h1>Point Distribution Map</h1>\
<p>The map below shows the distribution of crime incident cases in Buffalo by council district.</p>\
<p>Use the tools to the right of the map to pan, zoom, and so on. \
Hover over a circle to see the council district and number of cases.</p>\
<p><b><i>Data Source</i></b>: <a href='https://data.buffalony.gov/' target='_blank'>Buffalo Open Data</a>.</p>\
<p style="font-size:9px;">Maps created 4/10/2022 by Nguyet Que T. Tran.</p>""", sizing_mode="stretch_both")

layout = column(heading, row(f1),sizing_mode='stretch_both',margin=(5,5,5,5))
show(layout)

The map shows the number of confirmed crime cases by Buffalo council district. The centroid of each council district polygon is used to place a circle representing the total number of confirmed crime cases within that district. The higher the number, the larger the circle.
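
The notebook gets each district's center from the GeoPandas `centroid` property. As a rough illustration of what that computes for a single simple polygon ring, here is a dependency-free sketch of the area-weighted centroid (shoelace formula). The function name `polygon_centroid` and the rectangle are made up for this example, and real council districts are multipolygons, which this sketch does not handle.

```python
def polygon_centroid(ring):
    """Area-weighted centroid of a simple polygon given as [(x, y), ...]."""
    area2 = 0.0          # twice the signed area
    cx = cy = 0.0
    n = len(ring)
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]   # wrap around to close the ring
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0:
        raise ValueError("degenerate polygon (zero area)")
    return cx / (3 * area2), cy / (3 * area2)

# Illustrative rectangle, not a real district boundary
rectangle = [(0, 0), (4, 0), (4, 2), (0, 2)]
print(polygon_centroid(rectangle))
```

Note that for a concave shape the centroid can fall outside the polygon, so a circle placed there may not sit visually inside its district.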

In summary, Ellicott is the council district with the highest number of crime cases (45,497), while South has the lowest (17,308). However, almost half of the South council district consists of parks and other places without street addresses, where no crime cases are recorded in this dataset. We therefore cannot conclude that South is the safest council district in Buffalo.

Moreover, the Delaware council district has the second-lowest number of cases (18,578), and it also contains a large park.

Therefore, to assess how dangerous each council district is, we need to draw a point frequency map of all crime cases at duplicated locations.
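
Finding duplicated locations amounts to counting how many incidents share the same coordinates. The `counts` column used below was built earlier in the notebook; the following is only a minimal sketch of the idea, with a hypothetical helper `repeated_locations` and made-up coordinates.

```python
from collections import Counter

def repeated_locations(points, min_count=2):
    """Return {(x, y): n} for every location that appears at least min_count times."""
    tally = Counter(points)
    return {loc: n for loc, n in tally.items() if n >= min_count}

# Illustrative coordinates only (not taken from the dataset)
sample = [(1.0, 2.0), (1.0, 2.0), (3.0, 4.0),
          (1.0, 2.0), (5.0, 6.0), (5.0, 6.0)]
hot_spots = repeated_locations(sample)
print(hot_spots)
```

Exact matching on coordinates only groups incidents geocoded to the identical point; nearby addresses would need rounding or spatial binning, which is not shown here.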

Point Frequency Map

In [243]:
# Keep rows with a known incident type (in pandas, `!= None` is True for every row, so use notna())
Allcases = allHl.loc[allHl.parent_incident_type.notna()].copy()
Allcases.to_crs('epsg:3857',inplace=True)
Allcases.tail()
Out[243]:
index council_district city neighborhood_1 police_district latitude parent_incident_type state geoid20_block day_of_week tractce20 incident_description geoid20_tract census_block longitude census_block_2010 census_block_group census_block_group_2010 census_tract hour_of_day created_at address_1 geoid20_blockgroup incident_type_primary case_number census_tract_2010 incident_datetime geometry conrank counts
8540 274958 NORTH Buffalo Elmwood Bidwell District D 42.934 Theft of Vehicle NY 360290035021000 Friday 980000 Buffalo Police are investigating this report o... 36029980000 1000 -78.88 1000 1 1 9800 21 2014-12-06T07:05:00 E DIST 360290001101 UUV 14-3390890 62.01 2014-12-05T21:55:00 POINT (-8780881.434 5301931.363) gray 1
8541 276157 DELAWARE Buffalo North Park District D 42.959 Robbery NY 360290035021006 Sunday 005100 Buffalo Police are investigating this report o... 36029005100 1006 -78.851 1006 1 1 51 23 2010-11-02T06:01:00 900 Block KENMORE AV 360290001101 ROBBERY 10-3041077 51 2010-10-31T23:45:00 POINT (-8777653.169 5305733.310) gray 1
8542 276386 ELLICOTT Buffalo Central District B 42.884 Theft NY 360290025021007 Monday 007102 Buffalo Police are investigating this report o... 36029007102 1007 -78.883 1011 1 1 71.02 23 2013-11-28T07:02:00 100 Block CHURCH ST 360290001101 LARCENY/THEFT 13-3220987 71.02 2013-11-18T23:00:00 POINT (-8781215.392 5294332.098) gray 1
8543 276701 NORTH Buffalo Black Rock District D 42.946 Breaking & Entering NY 360290035021003 Saturday 005700 Buffalo Police are investigating this report o... 36029005700 1003 -78.897 1009 1 1 57 23 2019-09-24T22:47:00 1 Block RIVER ROCK DR 360290001101 BURGLARY 06-2871431 57 2006-10-14T23:01:00 POINT (-8782773.865 5303756.105) gray 1
8544 277013 FILLMORE Buffalo First Ward District A 42.868 Theft NY 360290165001019 Thursday 016400 Buffalo Police are investigating this report o... 36029016400 1019 -78.869 1025 1 1 164 10 2015-05-09T06:06:00 300 Block OHIO ST 360290001101 LARCENY/THEFT 15-1280373 164 2015-05-07T10:53:00 POINT (-8779656.919 5291901.635) gray 1
In [244]:
maxcir = 60
maxcnt = Allcases.counts.max()
Allcases['radius']=(Allcases.counts/maxcnt*maxcir)
Allcases['radius']=Allcases['radius'].astype(float).round().astype(int)
Allcases.head()
Out[244]:
index council_district city neighborhood_1 police_district latitude parent_incident_type state geoid20_block day_of_week tractce20 incident_description geoid20_tract census_block longitude census_block_2010 census_block_group census_block_group_2010 census_tract hour_of_day created_at address_1 geoid20_blockgroup incident_type_primary case_number census_tract_2010 incident_datetime geometry conrank counts radius
0 0 LOVEJOY Buffalo Schiller Park District E 42.918 Theft NY 360290037001005 Thursday 003700 Buffalo Police are investigating this report o... 36029003700 1005 -78.804 1005 1 1 37 7 2022-04-21T07:08:28 0 Block ROGERS AV 360290037001 LARCENY/THEFT 22-1110134 37 2022-04-21T07:08:28 POINT (-8772421.152 5299498.928) gray 66 4
1 1 UNIVERSITY Buffalo Kensington-Bailey District E 42.939 Theft NY 360290044014004 Thursday 004401 Buffalo Police are investigating this report o... 36029004401 4004 -78.8 4004 4 4 44.01 23 2022-04-21T23:28:41 400 Block EGGERT RD 360290044014 LARCENY/THEFT 22-1110924 44.01 2022-04-21T17:30:41 POINT (-8771975.875 5302691.629) gray 36 2
2 2 NORTH Buffalo North Park District D 42.952 Theft NY 360290050003005 Friday 005000 Buffalo Police are investigating this report o... 36029005000 3005 -78.876 3010 3 3 50 15 2022-04-22T15:18:07 1900 Block ELMWOOD AV 360290050003 LARCENY/THEFT 22-1120542 50 2022-04-22T15:18:07 POINT (-8780436.156 5304668.609) gray 77 4
3 3 FILLMORE Buffalo First Ward District A 42.868 Theft NY 360290005001000 Friday 000500 Buffalo Police are investigating this report o... 36029000500 1000 -78.85 1000 1 1 5 22 2022-04-22T22:25:34 800 Block S PARK AV 360290005001 LARCENY/THEFT 22-1120960 5 2022-04-22T22:00:34 POINT (-8777541.849 5291901.635) gray 25 1
4 4 FILLMORE Buffalo Ellicott District A 42.874 Theft NY 360290164001003 Sunday 016400 Buffalo Police are investigating this report o... 36029016400 1003 -78.871 1011 1 1 164 2 2022-04-24T02:01:30 0 Block FULTON ST 360290164001 LARCENY/THEFT 22-1140104 164 2022-04-24T00:58:30 POINT (-8779879.558 5292812.985) gray 529 31
In [245]:
Allcases.to_crs('epsg:3857',inplace=True)

f1 = figure(title = "Location of crime cases in Buffalo", tools=TOOLS, toolbar_sticky=False,**kwargs)

f1.add_tile(tileProvider)
f1.title.text_font_style = 'italic'
f1.title.text_font_size = '14pt'
f1.axis.visible=False 

point_source_1 = GeoJSONDataSource(geojson=Allcases.to_json())
poly_source = GeoJSONDataSource(geojson=cd_gdf.to_json())

Circle1=f1.circle('x','y',size='radius',fill_color='blue',line_color='blue',fill_alpha=0.5,source=point_source_1)
areas = f1.patches('xs','ys',source=poly_source,name="Council Districts",fill_color=None,fill_alpha=0.6,line_color="red",line_width=0.9)

c_hover= HoverTool(renderers=[Circle1])
c_hover.point_policy = "follow_mouse"
c_hover.tooltips=[("Address","@address_1," "@council_district"),
                  ("   " , "    "),
                  ("Number of Cases","@counts")]

f1.add_tools(c_hover)

heading = Div(text="""<h1>All Crimes Point Frequency Map</h1>\
<p>The map below shows the locations and frequencies of all crime cases in Buffalo, with council district boundaries.</p>\
<p>Use the tools to the right of the map to pan, zoom, and so on. \
Hover over a point to see its address and number of cases.</p>\
<p><b><i>Data Source</i></b>: <a href='https://data.buffalony.gov/' target='_blank'>Buffalo Open Data</a>.</p>\
<p style="font-size:9px;">Maps created 4/10/2022 by Nguyet Que T. Tran.</p>""", sizing_mode="stretch_both")

layout = column(heading, row(f1),sizing_mode='stretch_both',margin=(5,5,5,5))
show(layout)

Conclusion

Based on the map above, we can see the frequency and level of crime across the city.

According to the size and frequency of the circles, we can conclude that Ellicott is the most dangerous council district in Buffalo.

Delaware appears safer than South, with smaller circles. Moreover, combined with the homicide map, no homicide case occurred in Delaware, while South has a location with 62 homicide cases. So Delaware is the safest council district in Buffalo.